15 research outputs found

    DIRECTOR: Generator-Classifiers For Supervised Language Modeling

    Current language models achieve low perplexity, but their generations still suffer from toxic responses, repetitiveness, and contradictions. The standard language modeling setup fails to address these issues. In this paper, we introduce a new architecture, Director, that consists of a unified generator-classifier with both a language modeling head and a classification head for each output token. Training is conducted jointly using both standard language modeling data and data labeled with desirable and undesirable sequences. Experiments in several settings show that the model has competitive training and decoding speed compared to standard language models while yielding superior results, alleviating known issues while maintaining generation quality. It also outperforms existing model-guiding approaches in terms of both accuracy and efficiency.
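    The generator-classifier idea described above, scoring each candidate next token with both a language-modeling head and a classification head, can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code; the additive combination, the `gamma` weight, and the toy vocabulary are all assumptions.

    ```python
    import math

    def director_scores(lm_logits, clf_pos_logprob, gamma=1.0):
        # Log-softmax the LM head's logits into next-token log-probabilities.
        logz = math.log(sum(math.exp(x) for x in lm_logits))
        lm_logprob = [x - logz for x in lm_logits]
        # Add the classifier head's log P(desirable | token), weighted by
        # gamma (an assumed knob; the paper's exact combination may differ).
        return [lp + gamma * c for lp, c in zip(lm_logprob, clf_pos_logprob)]

    # Toy 4-token vocabulary: token 1 is likely under the LM alone, but the
    # classifier head flags it as undesirable, so its combined score drops.
    lm_logits = [2.0, 1.0, 0.5, -1.0]
    clf_pos_logprob = [math.log(p) for p in (0.9, 0.1, 0.8, 0.5)]
    scores = director_scores(lm_logits, clf_pos_logprob)
    best_token = max(range(len(scores)), key=scores.__getitem__)
    ```

    With these toy numbers, token 0 wins because both heads favor it, while the LM-preferred but classifier-disfavored token 1 ends up scored below token 2.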

    Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation

    Current language generation models suffer from issues such as repetition, incoherence, and hallucination. An often-repeated hypothesis is that this brittleness is caused by the mismatch between the training and generation procedures, also referred to as exposure bias. In this paper, we verify this hypothesis by analyzing exposure bias from an imitation learning perspective. We show that exposure bias leads to an accumulation of errors, analyze why perplexity fails to capture this accumulation, and empirically show that this accumulation results in poor generation quality. Source code to reproduce these experiments is available at https://github.com/kushalarora/quantifying_exposure_bias. Comment: Accepted in Findings of ACL 2022.
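    The error-accumulation argument can be illustrated with a toy model: if, during free-running generation, each step derails independently with some small probability once the prefix is clean, then the chance that a length-T continuation stays on-distribution decays geometrically in T. This is our illustrative simplification, not the paper's formal imitation-learning analysis; `eps` is an assumed per-step error rate.

    ```python
    def p_on_distribution(eps, T):
        # Probability that all T free-running steps stay on-distribution,
        # assuming each step errs independently with probability eps.
        return (1.0 - eps) ** T

    # Even a small 5% per-step error rate compounds over long generations,
    # which is invisible to per-token metrics like perplexity.
    short = p_on_distribution(0.05, 5)    # ~0.77
    long = p_on_distribution(0.05, 100)   # ~0.006
    ```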

    The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation

    State-of-the-art language generation models can degenerate when applied to open-ended generation problems such as text completion, story generation, or dialog modeling. This degeneration usually shows up in the form of incoherence, lack of vocabulary diversity, and self-repetition or copying from the context. In this paper, we postulate that "human-like" generations usually lie in a narrow and nearly flat entropy band, and that violations of these entropy bounds correlate with degenerate behavior. Our experiments show that this stable narrow entropy zone exists across models, tasks, and domains, and confirm the hypothesis that violations of this zone correlate with degeneration. We then use this insight to propose an entropy-aware decoding algorithm that respects these entropy bounds, resulting in less degenerate, more contextual, and "human-like" language generation in open-ended text generation settings.
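    The proposed intervention can be sketched as follows: compute the entropy of the model's next-token distribution and sample freely only while it stays inside an assumed stable band, otherwise fall back to a low-entropy choice such as greedy argmax. The band limits and the greedy fallback are our assumptions here; the paper's actual algorithm may differ in both.

    ```python
    import math, random

    def entropy(probs):
        # Shannon entropy (in nats) of a next-token distribution.
        return -sum(p * math.log(p) for p in probs if p > 0)

    def entropy_aware_pick(probs, lower, upper, rng=random.Random(0)):
        # Sample freely only while entropy stays inside the assumed stable
        # band [lower, upper]; otherwise intervene with argmax (one
        # plausible fallback, not necessarily the paper's choice).
        h = entropy(probs)
        if lower <= h <= upper:
            return rng.choices(range(len(probs)), weights=probs)[0], h
        return max(range(len(probs)), key=probs.__getitem__), h

    # A sharply peaked distribution falls below the band, so the
    # intervention returns the argmax deterministically.
    token, h = entropy_aware_pick([0.97, 0.01, 0.01, 0.01], 0.5, 1.5)
    ```

    A uniform distribution over four tokens has entropy ln 4 ≈ 1.39, inside the assumed band, so it would be sampled from rather than forced to argmax.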

    Drifts in protein and RNA as influenced by Rifampicin during seed germination in Pinus kesiya L. Royal ex-Gord.

    The effect of Rifampicin, a metabolic inhibitor, on the contents of total soluble proteins and RNA during imbibition, subsequent seed germination, and seedling emergence has been studied in the embryonal and extra-embryonal parts of Pinus kesiya.

    X-MODDES (eXtended Multi Operator Delimiter Based Data Encryption Standard)

    An algorithm is considered computationally secure if it cannot be broken with standard resources, either current or future. In this paper we introduce a new block cipher algorithm named X-MODDES. It is a unique, independent approach that uses several computational steps along with a string of operators and randomized delimiter selection based on suitable mathematical logic. X-MODDES is specially designed to produce different ciphertexts when the same key is applied to the same plaintext. Thus a new protocol has been designed to encrypt a given text, which allows a higher level of security compared to MODDES. The algorithm has been successfully implemented on a text file, a corresponding digital image file, and an audio file. We also highlight the performance of some well-known data encryption algorithms such as DES, Triple-DES, AES (Rijndael), and MODDES, and compare them with X-MODDES. Finally, it is shown that X-MODDES is among the best-performing partially symmetric key algorithms of those mentioned above, particularly for text messages of limited size.

    Macromolecular drifts associated with the effects of herbicides on the rooting of stem cuttings and rooting potential of Lantana camara L. var. aculeata

    Rooting of stem cuttings and the rooting potential of Lantana camara L. were studied, along with the changes in protein and RNA content occurring during the rooting process in response to certain herbicides. Paraquat, butachlor, CuSO4, and 2,4,5-T completely checked the rooting of the stem cuttings. Atrazine, TCA, and 2,4-D retarded the rooting response. Low as well as high doses of paraquat and butachlor, and only the higher dose of atrazine, were completely inhibitory. CuSO4 could not check the rooting potential of the plant, in contrast to its complete inhibition of the rooting of the stem cuttings. Paraquat, atrazine, butachlor, and CuSO4 significantly altered protein and RNA contents during different stages of rhizogenesis of the stem cuttings. The significance of the study is discussed in the light of controlling the vegetative reproduction of L. camara, which is a noxious weed on abandoned and arable lands.